GENERATING PARTIALS FOR PERSPECTIVE CORRECTED TEXTURE
COORDINATES IN A FOUR PIXEL TEXTURE PIPELINE 

BACKGROUND OF THE INVENTION 



1. Technical Field: 

The present invention relates to computer graphics.
More specifically, the present invention relates to the
selection of pixel density and image texture.

2. Description of Related Art: 

In computer graphics, the purpose of determining
partials is to find the rate of change inside the texture
map with respect to the current pixel being processed.
Partials are the partial derivatives of the texture
coordinates with respect to the screen coordinates. A
texture map is an image stored in memory that is applied
on a per-pixel basis to rendered primitives (graphics
elements that are used as building blocks for creating
images, such as a point, line, arc, cone or sphere).
These partials are used to pick a Level of Detail (LOD)
which has the closest rate of change to one.
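
As a rough illustration of the LOD selection described above,
the sketch below assumes a conventional mipmap chain in which
each level halves the resolution of the one before it, so a rate
of change measured at the base level shrinks by a factor of two
per level. The function name `pick_lod`, its parameters, and the
choice of the largest partial magnitude as the rate of change are
assumptions made for illustration; they are not taken from this
description.

```python
import math

def pick_lod(du_dx, du_dy, dv_dx, dv_dy, num_levels):
    """Pick the mipmap level whose rate of change is closest to one.

    Assumes level n halves the texture resolution n times, so a
    partial measured in base-level texels shrinks by 2**-n at level n.
    """
    # Largest rate of change, in base-level texels per screen pixel.
    rate = max(abs(du_dx), abs(du_dy), abs(dv_dx), abs(dv_dy))
    if rate <= 1.0:
        return 0  # texture is magnified; the base level is closest
    # rate / 2**lod is closest to one (in log space) at round(log2(rate)).
    lod = round(math.log2(rate))
    return min(lod, num_levels - 1)
```

For instance, `pick_lod(4.0, 0.5, 0.25, 3.0, 10)` selects level 2,
since a rate of 4 divided by 2**2 equals one.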

Triangle partials are used by the rasterizing
hardware to step in the x and y direction to render
pixels to the frame buffer. Perspective correction
generates nonlinear changes across the texture map so
that the resulting image is in accordance with the view
point being rendered. Normally the partials match the
triangle partials, but if the texture coordinates are
perspective corrected this is not true and the hardware
has to perform a calculation on a per pixel basis to find
the true partial.

Therefore, it would be desirable to have a method
for generating partials for perspective corrected texture
coordinates which requires fewer multiplies than the
traditional method of calculating the partials for each
pixel separately.

SUMMARY OF THE INVENTION

The present invention provides a method, program and
apparatus for generating partial differential equations
for perspective corrected texture coordinates in a
computer graphics display. The present invention
comprises calculating texture coordinates for four
adjacent pixels and then determining the differences
between the coordinates. A perspective correction factor
is then calculated, which is multiplied by each
coordinate difference.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features believed characteristic of the
invention are set forth in the appended claims. The
invention itself, however, as well as a preferred mode of
use, further objectives and advantages thereof, will best
be understood by reference to the following detailed
description of an illustrative embodiment when read in
conjunction with the accompanying drawings, wherein:

Figure 1 depicts a pictorial representation of a
data processing system in which the present invention may
be implemented;

Figure 2 depicts a block diagram of a data processing
system in which the present invention may be implemented;

Figure 3 depicts a schematic diagram illustrating
an arrangement of pixels in accordance with the present
invention;

Figure 4 depicts a schematic diagram illustrating
the definition of the deltas for adjacent pixels in
accordance with the present invention;

Figure 5 depicts a schematic diagram illustrating a
hardware implementation for the partials in accordance
with the present invention; and

Figure 6 depicts a flowchart illustrating the
process of hardware implementation for the partials in
accordance with the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

With reference now to the figures and in particular
with reference to Figure 1, a pictorial representation of
a data processing system in which the present invention
may be implemented is depicted in accordance with a
preferred embodiment of the present invention. A
computer 100 is depicted which includes a system unit
110, a video display terminal 102, a keyboard 104,
storage devices 108, which may include floppy drives and
other types of permanent and removable storage media, and
mouse 106. Additional input devices may be included with
personal computer 100, such as, for example, a joystick,
touchpad, touch screen, trackball, microphone, and the
like. Computer 100 can be implemented using any suitable
computer, such as an IBM RS/6000 computer or
IntelliStation computer, which are products of
International Business Machines Corporation, located in
Armonk, New York. Although the depicted representation
shows a computer, other embodiments of the present
invention may be implemented in other types of data
processing systems, such as a network computer. Computer
100 also preferably includes a graphical user interface
that may be implemented by means of systems software
residing in computer readable media in operation within
computer 100.

With reference now to Figure 2, a block diagram of a
data processing system is shown in which the present
invention may be implemented. Data processing system 200
is an example of a computer, such as computer 100 in
Figure 1, in which code or instructions implementing the
processes of the present invention may be located. Data
processing system 200 employs a peripheral component
interconnect (PCI) local bus architecture. Although the
depicted example employs a PCI bus, other bus
architectures such as Accelerated Graphics Port (AGP) and
Industry Standard Architecture (ISA) may be used.
Processor 202 and main memory 204 are connected to PCI
local bus 206 through PCI bridge 208. PCI bridge 208 also
may include an integrated memory controller and cache
memory for processor 202. Additional connections to PCI
local bus 206 may be made through direct component
interconnection or through add-in boards. In the depicted
example, local area network (LAN) adapter 210, small
computer system interface (SCSI) host bus adapter 212, and
expansion bus interface 214 are connected to PCI local bus
206 by direct component connection. In contrast, audio
adapter 216, graphics adapter 218, and audio/video adapter
219 are connected to PCI local bus 206 by add-in boards
inserted into expansion slots. Expansion bus interface
214 provides a connection for a keyboard and mouse adapter
220, modem 222, and additional memory 224. SCSI host bus
adapter 212 provides a connection for hard disk drive 226,
tape drive 228, and CD-ROM drive 230. Typical PCI local
bus implementations will support three or four PCI
expansion slots or add-in connectors.

An operating system runs on processor 202 and is used
to coordinate and provide control of various components
within data processing system 200 in Figure 2. The
operating system may be a commercially available operating
system such as Windows 2000, which is available from
Microsoft Corporation. An object oriented programming
system such as Java may run in conjunction with the
operating system and provides calls to the operating
system from Java programs or applications executing on
data processing system 200. "Java" is a trademark of Sun
Microsystems, Inc. Instructions for the operating system,
the object-oriented programming system, and applications
or programs are located on storage devices, such as hard
disk drive 226, and may be loaded into main memory 204 for
execution by processor 202.

Those of ordinary skill in the art will appreciate
that the hardware in Figure 2 may vary depending on the
implementation. Other internal hardware or peripheral
devices, such as flash ROM (or equivalent nonvolatile
memory) or optical disk drives and the like, may be used
in addition to or in place of the hardware depicted in
Figure 2. Also, the processes of the present invention
may be applied to a multiprocessor data processing
system.

For example, data processing system 200, if
optionally configured as a network computer, may not
include SCSI host bus adapter 212, hard disk drive 226,
tape drive 228, and CD-ROM 230, as noted by dotted line
232 in Figure 2 denoting optional inclusion. In that
case, the computer, to be properly called a client
computer, must include some type of network communication
interface, such as LAN adapter 210, modem 222, or the
like. As another example, data processing system 200 may
be a stand-alone system configured to be bootable without
relying on some type of network communication interface,
whether or not data processing system 200 comprises some
type of network communication interface. As a further
example, data processing system 200 may be a personal
digital assistant (PDA), which is configured with ROM
and/or flash ROM to provide non-volatile memory for
storing operating system files and/or user-generated
data.

The depicted example in Figure 2 and above-described
examples are not meant to imply architectural
limitations. For example, data processing system 200 also
may be a notebook computer or hand held computer in
addition to taking the form of a PDA. Data processing
system 200 also may be a kiosk or a Web appliance.

The processes of the present invention are performed by
processor 202 using computer implemented instructions,
which may be located in a memory such as, for example,
main memory 204, memory 224, or in one or more peripheral
devices 226-230. The present invention can be
implemented in graphics adapter 218.

It has become possible to process multiple texture
pixels in a cycle with the advance of silicon technology.
In a system that lights four texture pixels
simultaneously, the partials can be calculated by taking
the difference between the calculated texture coordinates
at each of the four adjacent pixels and then multiplying
each difference by a factor based on the perspective
correction coordinate. This method requires fewer
multiplies than the traditional method of calculating the
partials for each pixel separately.

In the following description, the term "multiply"
refers to the process of multiplication between numbers.
In hardware, a multiply would be replaced with a
multiplication unit (i.e. floating point or fixed point
multiplication unit). The same applies to the terms
"subtracts" and "adds". These functions can be replaced
by actual hardware units implementing the functions, or
replaced in software with a line of code performing those
functions.

S, T, R, and Q are texture coordinates that are sent
at each vertex of a triangle to index into a texture map.
S, T, R, and Q as capital letters refer to the
non-perspective corrected coordinates.

The lower case s, t, r, and q are the perspective
corrected version of the texture coordinates. They can
be written as:

$$ s = \frac{S}{Q} \qquad \text{(eq. 1)} $$

$$ t = \frac{T}{Q} \qquad \text{(eq. 2)} $$

$$ r = \frac{R}{Q} \qquad \text{(eq. 3)} $$

$$ \mathit{qr} = \frac{1}{Q} \qquad \text{(eq. 4)} $$

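The standard method for calculating the partials is
derived by taking the partial derivative of the
perspective corrected coordinate with respect to the
screen coordinate, or $\partial s / \partial x$.

Using eq. 1 this yields:

$$ \frac{\partial s}{\partial x} = \frac{\partial}{\partial x}\left(\frac{S}{Q}\right) \qquad \text{(eq. 5)} $$

Using the Quotient Rule of Derivatives:

$$ \frac{\partial s}{\partial x} = \frac{Q\,\frac{\partial S}{\partial x} - S\,\frac{\partial Q}{\partial x}}{Q^{2}} \qquad \text{(eq. 6)} $$

The partials on the right hand side of eq. 6 are the
triangle partials of the non-perspective corrected
texture coordinates. Eq. 6 can be reduced to:

$$ \frac{\partial s}{\partial x} = \frac{1}{Q}\left(\frac{\partial S}{\partial x} - \frac{S}{Q}\,\frac{\partial Q}{\partial x}\right) \qquad \text{(eq. 7)} $$

Substituting in eq. 1 and eq. 4 generates the standard
equation:

$$ \frac{\partial s}{\partial x} = \mathit{qr}\left(\frac{\partial S}{\partial x} - s\,\frac{\partial Q}{\partial x}\right) \qquad \text{(eq. 8)} $$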

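Eq. 8 is what is typically implemented in hardware
applications. There are two multiplies and one subtract
per partial. The actual value needed is
$\partial u / \partial x$, where u is an additional way of
indexing the texture map that ranges from 0 to the width
of the texture map, versus from 0 to 1 with s.

$$ u = s\,w \qquad \text{(eq. 9, where } w \text{ is the width)} $$

Since w is constant across the texture map:

$$ \frac{\partial u}{\partial x} = \frac{\partial s}{\partial x}\,w \qquad \text{(eq. 10)} $$

This yields a final equation of:

$$ \frac{\partial u}{\partial x} = \mathit{qr}\left(\frac{\partial S}{\partial x} - s\,\frac{\partial Q}{\partial x}\right) w \qquad \text{(eq. 11)} $$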

In some hardware systems the width, height and depth
are restricted to values that are powers of two. In such
a case, exponents can be added instead of doing a
multiply. However, since more systems are moving to
non-power-of-two textures, and because the present
discussion is only concerned with the generic set of
equations for calculating partials, the width factor will
be treated as a multiply. This yields a total of 72
multiplies and 24 subtracts for calculating all the
partials (3D included) for four pixels simultaneously.
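
The operation count above can be reproduced with a short
sketch. It evaluates the standard per-pixel equation for one
partial (three multiplies and one subtract once the width is
included) and then tallies the work for four pixels, assuming
two screen directions and three texture coordinates per pixel.
The function name `standard_partial`, the variable names, and
the sample values are assumptions chosen for illustration.

```python
def standard_partial(qr, s, dS_dx, dQ_dx, width):
    """Standard equation: du/dx = qr * (dS/dx - s * dQ/dx) * width."""
    return qr * (dS_dx - s * dQ_dx) * width   # three multiplies, one subtract

# Sample values (assumed): S = 1 and Q = 2 at the pixel, so s = 0.5, qr = 0.5.
du_dx = standard_partial(qr=0.5, s=0.5, dS_dx=0.5, dQ_dx=0.25, width=256.0)
print(du_dx)   # 48.0

# Tally for four pixels, two directions and three coordinates (3D textures).
partials = 4 * 2 * 3
print(partials * 3, partials * 1)   # 72 24
```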

Referring now to Figure 3, a schematic diagram
illustrating an arrangement of pixels is depicted in
accordance with the present invention. With a texture
engine that lights four pixels in a cycle, this new
algorithm can be implemented. The pixels must be
adjacent so that the deltas between x values and y values
are always one. Figure 3 shows the arrangement.



It should be noted that the subscripts on the
texture coordinates in the equations refer to the
coordinates for the associated pixel in Figure 3.

If perspective correction is off, it makes sense that:

$$ \frac{\partial u}{\partial x} = \frac{\Delta u}{\Delta x} \qquad \text{(eq. 12)} $$

Since, for this case, $\Delta x = 1$ (see Figure 3), this
can be rewritten:

$$ \frac{\partial u}{\partial x} = u_1 - u_0 \qquad \text{(eq. 13)} $$



This, however, is not correct if the texture is
perspective corrected. Again, since the width is
constant:

$$ \frac{\Delta u_0}{\Delta x} = (s_1 - s_0)\,w \qquad \text{(eq. 14)} $$

Going back to eq. 1, s1 and s0 can be found as:

$$ s_1 = \frac{S_1}{Q_1} \qquad \text{(eq. 15)} $$

$$ s_0 = \frac{S_0}{Q_0} \qquad \text{(eq. 16)} $$



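In the rasterizing process, subsequent coordinates are
found by adding the partials in the given directions.
Therefore:

$$ S_1 = S_0 + \frac{\partial S}{\partial x} \qquad \text{(eq. 17)} $$

$$ Q_1 = Q_0 + \frac{\partial Q}{\partial x} \qquad \text{(eq. 18)} $$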



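Using eq. 14 the slope for s can be found.

$$ \frac{\Delta s_0}{\Delta x} = \frac{S_0 + \frac{\partial S}{\partial x}}{Q_0 + \frac{\partial Q}{\partial x}} - \frac{S_0}{Q_0} \qquad \text{(eq. 19)} $$

It is obvious that eq. 19 does not match eq. 8, but if
the error between them can be found, a new solution for
$\partial s / \partial x$ can be generated using the error
term. The error is:

$$ \text{error} = \frac{\text{eq. 19} - \text{eq. 8}}{\text{eq. 8}} \qquad \text{(eq. 20)} $$

or

$$ \text{error} = \frac{\text{eq. 19}}{\text{eq. 8}} - 1 \qquad \text{(eq. 21)} $$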



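Substituting eq. 19 and eq. 8 (evaluated at pixel 0) into
the error equation produces the following equation:

$$ \text{error} = \frac{\dfrac{S_0 + \frac{\partial S}{\partial x}}{Q_0 + \frac{\partial Q}{\partial x}} - \dfrac{S_0}{Q_0}}{\dfrac{1}{Q_0}\left(\dfrac{\partial S}{\partial x} - \dfrac{S_0}{Q_0}\,\dfrac{\partial Q}{\partial x}\right)} - 1 \qquad \text{(eq. 22)} $$

Reduction of eq. 22:

$$
\begin{aligned}
\text{error} &= \frac{\dfrac{Q_0\left(S_0 + \frac{\partial S}{\partial x}\right) - S_0\left(Q_0 + \frac{\partial Q}{\partial x}\right)}{Q_0\left(Q_0 + \frac{\partial Q}{\partial x}\right)}}{\dfrac{Q_0\,\frac{\partial S}{\partial x} - S_0\,\frac{\partial Q}{\partial x}}{Q_0^{2}}} - 1 \\
&= \frac{Q_0\left(Q_0\,\frac{\partial S}{\partial x} - S_0\,\frac{\partial Q}{\partial x}\right)}{\left(Q_0 + \frac{\partial Q}{\partial x}\right)\left(Q_0\,\frac{\partial S}{\partial x} - S_0\,\frac{\partial Q}{\partial x}\right)} - 1
\end{aligned}
$$

$$ \text{error} = \frac{Q_0}{Q_0 + \frac{\partial Q}{\partial x}} - 1 \qquad \text{(eq. 23)} $$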
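
The reduction can be spot-checked numerically. The sketch below
uses exact rational arithmetic to compare the error computed
directly from the pixel-to-pixel delta and the true partial with
the reduced form Q0 / (Q0 + dQ/dx) - 1. The function name
`error_term` and the sample values are assumptions chosen for
illustration; a single sample is a sanity check, not a proof.

```python
from fractions import Fraction as F

def error_term(S0, Q0, dS_dx, dQ_dx):
    """Relative error of the pixel-to-pixel delta (eq. 19) against
    the true partial (eq. 8), computed directly as in eq. 21."""
    delta = (S0 + dS_dx) / (Q0 + dQ_dx) - S0 / Q0      # eq. 19
    true_partial = (dS_dx - (S0 / Q0) * dQ_dx) / Q0    # eq. 8 at pixel 0
    return delta / true_partial - 1                    # eq. 21

# Arbitrary sample values (assumed).
S0, Q0, dS_dx, dQ_dx = F(3), F(2), F(1, 2), F(1, 4)
direct = error_term(S0, Q0, dS_dx, dQ_dx)
reduced = Q0 / (Q0 + dQ_dx) - 1                        # eq. 23
print(direct, reduced)   # -1/9 -1/9
```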



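It should be noted that the form of this equation matches
the original error equation (eq. 21). Setting them equal
produces:

$$ \frac{\Delta s_0 / \Delta x}{\partial s_0 / \partial x} - 1 = \frac{Q_0}{Q_0 + \frac{\partial Q}{\partial x}} - 1 \qquad \text{(eq. 24)} $$

Therefore:

$$ \frac{\partial s_0}{\partial x} = \frac{\Delta s_0}{\Delta x}\left(\frac{Q_0 + \frac{\partial Q}{\partial x}}{Q_0}\right) \qquad \text{(eq. 25)} $$

Substituting eq. 18 into eq. 25:

$$ \frac{\partial s_0}{\partial x} = \frac{\Delta s_0}{\Delta x}\left(\frac{Q_1}{Q_0}\right) \qquad \text{(eq. 26)} $$

or

$$ \frac{\partial s_0}{\partial x} = \frac{\Delta s_0}{\Delta x}\,q_1\,\mathit{qr}_0 \qquad \text{(eq. 27)} $$

Here q1 is the value of Q at pixel 1 and qr0 is the
reciprocal of Q at pixel 0, as defined in eq. 4.

Since, ultimately, $\partial u / \partial x$ is being
sought, equations 10 and 14 can be used to substitute in
and rewrite equation 27 in terms of
$\partial u / \partial x$, which yields equation 28:

$$ \frac{\partial u_0}{\partial x} = \frac{\Delta u_0}{\Delta x}\,q_1\,\mathit{qr}_0 \qquad \text{(eq. 28)} $$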

Equation 28 has one subtract and two multiplies per
partial. Each partial can be derived similarly to eq.
28. In total, this yields 48 multiplies versus 72
multiplies for the standard equations.
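
A small numeric sketch can confirm that the corrected delta
agrees with the standard per-pixel equation, while the raw delta
does not. The function name `corrected_partial` and the sample
triangle values are assumptions for illustration; the reciprocal
of Q at pixel 0 is assumed to be available already, so it is
passed in rather than counted as extra work.

```python
def corrected_partial(u0, u1, q1, qr0):
    """Eq. 28 with delta-x equal to one: du0/dx = (u1 - u0) * q1 * qr0."""
    return (u1 - u0) * q1 * qr0

# Sample triangle values along x (assumed for illustration).
S0, dS_dx = 1.0, 0.5
Q0, dQ_dx = 2.0, 0.25
width = 256.0

S1, Q1 = S0 + dS_dx, Q0 + dQ_dx               # eq. 17 and eq. 18
u0, u1 = width * S0 / Q0, width * S1 / Q1     # eqs. 9, 15 and 16

from_delta = corrected_partial(u0, u1, Q1, 1.0 / Q0)
standard = (1.0 / Q0) * (dS_dx - (S0 / Q0) * dQ_dx) * width   # eq. 11
uncorrected = u1 - u0                                          # eq. 13

print(from_delta, standard, uncorrected)
```

With these values the corrected delta and the standard equation
agree to within floating point rounding, while the uncorrected
delta is smaller by the factor Q0 / Q1.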

Referring to Figure 4, a schematic diagram
illustrating the definition of the deltas for adjacent
pixels is depicted in accordance with the present
invention. To define the rest of the partials, all of the
deltas for a group of four adjacent pixels must first be
defined. Each number in Figure 4 represents a texture
coordinate (u). Figure 4 defines the difference between
u0 and u1, divided by the change in x, as
$\Delta u_0 / \Delta x$. The same applies to the other
changes depicted in Figure 4. $\Delta u_1 / \Delta y$ is
defined as the difference between u1 and u3, divided by
the change in y. As the change in x and change in y are
always one (the pixels are adjacent), Figure 4 defines
equations 29 through 32.

Equations 29 through 32 show the equations for each
of the deltas.

$$ \frac{\Delta u_0}{\Delta x} = u_1 - u_0 \qquad \text{(eq. 29)} $$

$$ \frac{\Delta u_2}{\Delta x} = u_3 - u_2 \qquad \text{(eq. 30)} $$

$$ \frac{\Delta u_0}{\Delta y} = u_2 - u_0 \qquad \text{(eq. 31)} $$

$$ \frac{\Delta u_1}{\Delta y} = u_3 - u_1 \qquad \text{(eq. 32)} $$
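
The four deltas amount to four subtracts per texture coordinate.
A minimal sketch follows, assuming pixels 0 and 1 share a row and
pixels 0 and 2 share a column, which is the layout implied by the
delta definitions; the function name `quad_deltas` and the sample
values are hypothetical.

```python
def quad_deltas(u0, u1, u2, u3):
    """Return the four deltas of eqs. 29-32 for a 2x2 pixel group.

    Adjacent pixels are one unit apart in x and y, so the divisions
    by delta-x and delta-y drop out and only subtracts remain.
    """
    dx0 = u1 - u0   # eq. 29: x delta shared by pixels 0 and 1
    dx2 = u3 - u2   # eq. 30: x delta shared by pixels 2 and 3
    dy0 = u2 - u0   # eq. 31: y delta shared by pixels 0 and 2
    dy1 = u3 - u1   # eq. 32: y delta shared by pixels 1 and 3
    return dx0, dx2, dy0, dy1

print(quad_deltas(10.0, 12.0, 11.0, 14.0))   # (2.0, 3.0, 1.0, 2.0)
```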



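From similar derivations of equation 28 the following set
of equations (33 to 40) can be found. These equations
describe how u is moving for all four pixels. The
equations for the other texture components v and w are
found exactly the same as u. Substituting v and w wherever
there is a u produces the v and w equations.

$$ \frac{\partial u_0}{\partial x} = \frac{\Delta u_0}{\Delta x}\,q_1\,\mathit{qr}_0 \qquad \text{(eq. 33)} $$

$$ \frac{\partial u_1}{\partial x} = \frac{\Delta u_0}{\Delta x}\,q_0\,\mathit{qr}_1 \qquad \text{(eq. 34)} $$

$$ \frac{\partial u_2}{\partial x} = \frac{\Delta u_2}{\Delta x}\,q_3\,\mathit{qr}_2 \qquad \text{(eq. 35)} $$

$$ \frac{\partial u_3}{\partial x} = \frac{\Delta u_2}{\Delta x}\,q_2\,\mathit{qr}_3 \qquad \text{(eq. 36)} $$

$$ \frac{\partial u_0}{\partial y} = \frac{\Delta u_0}{\Delta y}\,q_2\,\mathit{qr}_0 \qquad \text{(eq. 37)} $$

$$ \frac{\partial u_1}{\partial y} = \frac{\Delta u_1}{\Delta y}\,q_3\,\mathit{qr}_1 \qquad \text{(eq. 38)} $$

$$ \frac{\partial u_2}{\partial y} = \frac{\Delta u_0}{\Delta y}\,q_0\,\mathit{qr}_2 \qquad \text{(eq. 39)} $$

$$ \frac{\partial u_3}{\partial y} = \frac{\Delta u_1}{\Delta y}\,q_1\,\mathit{qr}_3 \qquad \text{(eq. 40)} $$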

In the next equation for v0 the delta is also multiplied
by q1 qr0:

$$ \frac{\partial v_0}{\partial x} = \frac{\Delta v_0}{\Delta x}\,q_1\,\mathit{qr}_0 \qquad \text{(eq. 41)} $$

This holds true for all the partials of u, v and w. This
means that in a hardware application eight factors can be
calculated once, and each of the deltas can then be
multiplied by the corresponding factor to get the final
results.

The final algorithm in equation form (eq. 42 to 73)
is presented below. The algorithm uses 12 subtracts and
32 multiplies, versus the 24 subtracts and 72 multiplies
of the standard algorithm in the prior art.

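The factors:

$$ f_{x0} = q_1\,\mathit{qr}_0 \qquad \text{(eq. 42)} $$

$$ f_{x1} = q_0\,\mathit{qr}_1 \qquad \text{(eq. 43)} $$

$$ f_{x2} = q_3\,\mathit{qr}_2 \qquad \text{(eq. 44)} $$

$$ f_{x3} = q_2\,\mathit{qr}_3 \qquad \text{(eq. 45)} $$

$$ f_{y0} = q_2\,\mathit{qr}_0 \qquad \text{(eq. 46)} $$

$$ f_{y1} = q_3\,\mathit{qr}_1 \qquad \text{(eq. 47)} $$

$$ f_{y2} = q_0\,\mathit{qr}_2 \qquad \text{(eq. 48)} $$

$$ f_{y3} = q_1\,\mathit{qr}_3 \qquad \text{(eq. 49)} $$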
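The partials:

$$ \frac{\partial u_0}{\partial x} = \frac{\Delta u_0}{\Delta x}\,f_{x0} \qquad \text{(eq. 50)} $$

$$ \frac{\partial u_0}{\partial y} = \frac{\Delta u_0}{\Delta y}\,f_{y0} \qquad \text{(eq. 51)} $$

$$ \frac{\partial v_0}{\partial x} = \frac{\Delta v_0}{\Delta x}\,f_{x0} \qquad \text{(eq. 52)} $$

$$ \frac{\partial v_0}{\partial y} = \frac{\Delta v_0}{\Delta y}\,f_{y0} \qquad \text{(eq. 53)} $$

$$ \frac{\partial w_0}{\partial x} = \frac{\Delta w_0}{\Delta x}\,f_{x0} \qquad \text{(eq. 54)} $$

$$ \frac{\partial w_0}{\partial y} = \frac{\Delta w_0}{\Delta y}\,f_{y0} \qquad \text{(eq. 55)} $$

$$ \frac{\partial u_1}{\partial x} = \frac{\Delta u_0}{\Delta x}\,f_{x1} \qquad \text{(eq. 56)} $$

$$ \frac{\partial u_1}{\partial y} = \frac{\Delta u_1}{\Delta y}\,f_{y1} \qquad \text{(eq. 57)} $$

$$ \frac{\partial v_1}{\partial x} = \frac{\Delta v_0}{\Delta x}\,f_{x1} \qquad \text{(eq. 58)} $$

$$ \frac{\partial v_1}{\partial y} = \frac{\Delta v_1}{\Delta y}\,f_{y1} \qquad \text{(eq. 59)} $$

$$ \frac{\partial w_1}{\partial x} = \frac{\Delta w_0}{\Delta x}\,f_{x1} \qquad \text{(eq. 60)} $$

$$ \frac{\partial w_1}{\partial y} = \frac{\Delta w_1}{\Delta y}\,f_{y1} \qquad \text{(eq. 61)} $$

$$ \frac{\partial u_2}{\partial x} = \frac{\Delta u_2}{\Delta x}\,f_{x2} \qquad \text{(eq. 62)} $$

$$ \frac{\partial u_2}{\partial y} = \frac{\Delta u_0}{\Delta y}\,f_{y2} \qquad \text{(eq. 63)} $$

$$ \frac{\partial v_2}{\partial x} = \frac{\Delta v_2}{\Delta x}\,f_{x2} \qquad \text{(eq. 64)} $$

$$ \frac{\partial v_2}{\partial y} = \frac{\Delta v_0}{\Delta y}\,f_{y2} \qquad \text{(eq. 65)} $$

$$ \frac{\partial w_2}{\partial x} = \frac{\Delta w_2}{\Delta x}\,f_{x2} \qquad \text{(eq. 66)} $$

$$ \frac{\partial w_2}{\partial y} = \frac{\Delta w_0}{\Delta y}\,f_{y2} \qquad \text{(eq. 67)} $$

$$ \frac{\partial u_3}{\partial x} = \frac{\Delta u_2}{\Delta x}\,f_{x3} \qquad \text{(eq. 68)} $$

$$ \frac{\partial u_3}{\partial y} = \frac{\Delta u_1}{\Delta y}\,f_{y3} \qquad \text{(eq. 69)} $$

$$ \frac{\partial v_3}{\partial x} = \frac{\Delta v_2}{\Delta x}\,f_{x3} \qquad \text{(eq. 70)} $$

$$ \frac{\partial v_3}{\partial y} = \frac{\Delta v_1}{\Delta y}\,f_{y3} \qquad \text{(eq. 71)} $$

$$ \frac{\partial w_3}{\partial x} = \frac{\Delta w_2}{\Delta x}\,f_{x3} \qquad \text{(eq. 72)} $$

$$ \frac{\partial w_3}{\partial y} = \frac{\Delta w_1}{\Delta y}\,f_{y3} \qquad \text{(eq. 73)} $$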
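
The complete set of factors and partials can be sketched in
software as follows. This is an illustrative model only, not the
hardware of the invention: the function names `quad_partials`
and `plane`, the dictionary layout, and the sample plane values
are assumptions, and the per-pixel reciprocals qr are assumed to
arrive from an earlier stage. The sketch also counts operations
so the totals of 32 multiplies and 12 subtracts can be checked.

```python
def quad_partials(coords, q, qr):
    """Partials for one 2x2 pixel group from deltas and eight factors.

    coords: {name: [c0, c1, c2, c3]} perspective corrected values per pixel
    q, qr:  per-pixel perspective values and their reciprocals
    Returns ({name: (d/dx list, d/dy list)}, multiplies, subtracts).
    """
    nx, ny = [1, 0, 3, 2], [2, 3, 0, 1]          # x and y neighbour of each pixel
    fx = [q[nx[i]] * qr[i] for i in range(4)]    # eqs. 42-45
    fy = [q[ny[i]] * qr[i] for i in range(4)]    # eqs. 46-49
    mults, subs = 8, 0
    out = {}
    for name, c in coords.items():
        dx0, dx2 = c[1] - c[0], c[3] - c[2]      # eqs. 29-30
        dy0, dy1 = c[2] - c[0], c[3] - c[1]      # eqs. 31-32
        subs += 4
        ddx = [dx0 * fx[0], dx0 * fx[1], dx2 * fx[2], dx2 * fx[3]]
        ddy = [dy0 * fy[0], dy1 * fy[1], dy0 * fy[2], dy1 * fy[3]]
        mults += 8
        out[name] = (ddx, ddy)
    return out, mults, subs

def plane(c0, cx, cy):
    """Values of a linear function at pixels 0..3: (0,0), (1,0), (0,1), (1,1)."""
    return [c0 + cx * x + cy * y for x, y in [(0, 0), (1, 0), (0, 1), (1, 1)]]

# Sample triangle planes and texture sizes (assumed values).
Q = plane(2.0, 0.25, 0.5)
S, T, R = plane(1.0, 0.5, 0.125), plane(0.5, 0.25, 0.75), plane(0.25, 0.125, 0.5)
coords = {
    "u": [256.0 * a / b for a, b in zip(S, Q)],
    "v": [128.0 * a / b for a, b in zip(T, Q)],
    "w": [64.0 * a / b for a, b in zip(R, Q)],
}
result, mults, subs = quad_partials(coords, Q, [1.0 / b for b in Q])
print(mults, subs)   # 32 12
```

Sharing the eight factors across u, v and w is what brings the
multiply count down from 48 to 32: only the 24 final products
depend on the coordinate being differentiated.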



Referring now to Figure 5, a schematic diagram
illustrating a hardware implementation for the partials
is depicted in accordance with the present invention.
The present system can be implemented by a series of
subtracts and multiplies. The subtracts and first stage
of multiplies can be done in parallel. The second stage
contains only the remaining multiplies. The entire
figure represents the hardware implementation of
equations 33 through 40 and shows the number of
multiplies and subtracts necessary for calculating the
partials of the texture coordinate u for four pixels.

Referring to Figure 6, a flowchart illustrating the
process of hardware implementation for the partials is
depicted in accordance with the present invention. The
process flow begins by calculating the deltas (step 601).
This process uses a set of subtracts in parallel to
create the deltas defined by equations 29 through 32.
Next, the correction factors are calculated (step 602).
The correction factors are produced by a set of multiplies
that take the incoming perspective coordinates and
generate the factors that will be multiplied by the deltas
(equations 42 through 49). In hardware, this occurs in
parallel with the subtraction calculations for the deltas
(they are not dependent upon each other). After the
correction factors are calculated, they are then
multiplied by the deltas to obtain the true partials
(step 603).

It is important to note that while the present
invention has been described in the context of a fully
functioning data processing system, those of ordinary
skill in the art will appreciate that the processes of
the present invention are capable of being distributed in
the form of a computer readable medium of instructions
and a variety of forms and that the present invention
applies equally regardless of the particular type of
signal bearing media actually used to carry out the
distribution. Examples of computer readable media
include recordable-type media, such as a floppy disk, a
hard disk drive, a RAM, CD-ROMs, DVD-ROMs, and
transmission-type media, such as digital and analog
communications links, wired or wireless communications
links using transmission forms, such as, for example,
radio frequency and light wave transmissions. The
computer readable media may take the form of coded
formats that are decoded for actual use in a particular
data processing system.

The description of the present invention has been
presented for purposes of illustration and description,
and is not intended to be exhaustive or limited to the
invention in the form disclosed. Many modifications and
variations will be apparent to those of ordinary skill in
the art. The embodiment was chosen and described in
order to best explain the principles of the invention,
the practical application, and to enable others of
ordinary skill in the art to understand the invention for
various embodiments with various modifications as are
suited to the particular use contemplated.



